(57) Abstract

A viral vector production system is provided which system comprises: (i) a viral genome comprising at least one first nucleotide 
sequence encoding a gene product capable of binding to and effecting the cleavage, directly or indirectly, of a second nucleotide sequence, 
or transcription product thereof, encoding a viral polypeptide required for the assembly of viral particles; (ii) a third nucleotide sequence 
encoding said viral polypeptide required for the assembly of the viral genome into viral particles, which third nucleotide sequence has a 
different nucleotide sequence to the second nucleotide sequence such that said third nucleotide sequence, or transcription product thereof, 
is resistant to cleavage directed by said gene product. The viral vector production system may be used to produce viral particles for use in 
treating or preventing viral infection. 
Field of the Invention

The present invention relates to novel viral vectors capable of delivering anti-viral
inhibitory RNA molecules to target cells. 

Background to the Invention 

The application of gene therapy to the treatment of AIDS and HIV infection has been
discussed widely (14). The types of therapeutic gene proposed usually fall into one of two
broad categories. In the first, the gene encodes protein products that inhibit the virus in a
number of possible ways. One example of such a protein is the RevM10 derivative of the
HIV Rev protein (16). The RevM10 protein acts as a transdominant negative mutant and
so competitively inhibits Rev function in the virus. Like many of the protein-based
strategies, the RevM10 protein is a derivative of a native HIV protein. While this provides
the basis for the anti-HIV effect, it also has serious disadvantages. In particular, this type
of strategy demands that in the absence of the virus there is little or no expression of the
gene. Otherwise, healthy cells harbouring the gene become a target for the host cytotoxic
T lymphocyte (CTL) system, which recognises the foreign protein (17, 25). The second
broad category of therapeutic gene circumvents these CTL problems. The therapeutic gene
encodes inhibitory RNA molecules; RNA is not a target for CTL recognition. The RNA
molecules may be anti-sense RNA (15, 31), ribozymes (5) or competitive decoys (1).

Ribozymes are enzymatic RNA molecules which catalyse sequence-specific RNA
processing. The design and structure of ribozymes has been described extensively in the 
literature in recent years (3, 7, 31). Amongst the most powerful systems are those that 
deliver multitarget ribozymes that cleave RNA of the target virus at multiple sites (5, 21). 

In recent years a number of laboratories have developed retroviral vector systems based on
HIV (2, 4, 18, 19, 22-24, 27, 32, 35, 39, 43). In the context of anti-HIV gene therapy these
vectors have a number of advantages over the more conventional murine based vectors
such as murine leukaemia virus (MLV) vectors. Firstly, HIV vectors would target
precisely those cells that are susceptible to HIV infection (22, 23). Secondly, the
HIV-based vector would transduce cells such as macrophages that are normally refractory to
transduction by murine vectors (19, 20). Thirdly, the anti-HIV vector genome would be
propagated through the CD4+ cell population by any virus (HIV) that escaped the
therapeutic strategy (7). This is because the vector genome has the packaging signal that
will be recognised by the viral particle packaging system. These various attributes make
HIV vectors a powerful tool in the field of anti-HIV gene therapy.

A combination of the multitarget ribozyme and an HIV-based vector would be attractive as
a therapeutic strategy. However, until now this has not been possible. Vector particle
production takes place in producer cells which express the packaging components of the
particles and package the vector genome. The ribozymes that are designed to destroy the
viral RNA would therefore also interrupt the expression of the components of the
HIV-based vector system during vector production. The present invention aims to overcome
this problem.

Summary of the Invention 

It is therefore an object of the invention to provide a system and method for producing viral
particles, in particular HIV particles, which carry nucleotide constructs encoding inhibitory
RNA molecules such as ribozymes and/or antisense RNAs directed against a
corresponding virus, such as HIV, within a target cell, that overcomes the above-mentioned
problems. The system includes both a viral genome encoding the inhibitory RNA
molecules and nucleotide constructs encoding the components required for packaging the
viral genome in a producer cell. However, in contrast to the prior art, although the
packaging components have substantially the same amino acid sequence as the
corresponding components of the target virus, the inhibitory RNA molecules do not affect
production of the viral particles in the producer cells because the nucleotide sequence of
the packaging components used in the viral system has been modified to prevent the
inhibitory RNA molecules from effecting cleavage or degradation of the RNA transcripts
produced from the constructs. Such a viral particle may be used to treat viral infections.

Accordingly the present invention provides a viral vector system comprising:

(i) a first nucleotide sequence encoding a gene product capable of binding to and
effecting the cleavage, directly or indirectly, of a second nucleotide sequence, or
transcription product thereof, encoding a viral polypeptide required for the assembly of
viral particles; and

(ii) a third nucleotide sequence encoding said viral polypeptide required for the
assembly of viral particles, which third nucleotide sequence has a different nucleotide
sequence to the second nucleotide sequence such that the third nucleotide sequence, or
transcription product thereof, is resistant to cleavage directed by said gene product.

In another aspect, the present invention provides a viral vector production system 
comprising: 

(i) a viral genome comprising at least one first nucleotide sequence encoding a gene
product capable of binding to and effecting the cleavage, directly or indirectly, of a second
nucleotide sequence, or transcription product thereof, encoding a viral polypeptide required
for the assembly of viral particles; and

(ii) a third nucleotide sequence encoding said viral polypeptide required for the
assembly of the viral genome into viral particles, which third nucleotide sequence has a
different nucleotide sequence to the second nucleotide sequence such that said third
nucleotide sequence, or transcription product thereof, is resistant to cleavage directed by
said gene product.

The gene product is typically an RNA inhibitory sequence selected from a ribozyme and an
anti-sense ribonucleic acid, preferably a ribozyme. 

Preferably, the viral vector is a retroviral vector, more preferably a lentiviral vector, such as
an HIV vector. The second nucleotide sequence and the third nucleotide sequence are
typically from the same viral species, more preferably from the same viral strain.
Generally, the viral genome is also from the same viral species, more preferably from the
same viral strain.

In the case of retroviral vectors, the polypeptide required for the assembly of viral particles
is selected from gag, pol and env proteins. Preferably at least the gag and pol sequences
are lentiviral sequences, more preferably HIV sequences. Alternatively, or in addition, the
env sequence is a lentiviral sequence, more preferably an HIV sequence.

In a preferred embodiment, the third nucleotide sequence is resistant to cleavage directed
by the gene product as a result of one or more conservative alterations in the nucleotide
sequence which remove cleavage sites recognised by the at least one gene product and/or
binding sites for the at least one gene product. For example, where the gene product is a
ribozyme, the third nucleotide sequence is adapted to be resistant to cleavage by the
ribozyme.

Preferably the third nucleotide sequence is codon optimised for expression in host cells.
The host cells, which term includes producer cells and packaging cells, are typically
mammalian cells.

In a particularly preferred embodiment, (i) the viral genome is an HIV genome comprising
nucleotide sequences encoding anti-HIV ribozymes and/or anti-HIV antisense sequences
directed against HIV packaging component sequences (such as gag.pol) in a target HIV
and (ii) the viral system for producing packaged HIV particles further comprises nucleotide
constructs encoding the same packaging components (such as gag.pol proteins) as in the
target HIV wherein the sequence of the nucleotide constructs is different from that found in
the target HIV so that the anti-HIV ribozyme and/or antisense HIV sequences cannot effect
cleavage or degradation of the gag.pol transcripts during production of the HIV particles in
producer cells.

The present invention also provides a viral particle comprising a viral vector according to
the present invention and one or more polypeptides encoded by the third nucleotide
sequences according to the present invention. For example the present invention provides
a viral particle produced using the viral vector production system of the invention.



In another aspect, the present invention provides a method for producing a viral particle
which method comprises introducing into a host cell (i) a viral genome vector according to
the present invention; (ii) one or more third nucleotide sequences according to the present
invention; and (iii) nucleotide sequences encoding the other essential viral packaging
components not encoded by the one or more third nucleotide sequences.

The present invention further provides a viral particle produced by the method of the
invention.



The present invention also provides a pharmaceutical composition comprising a viral
particle according to the present invention together with a pharmaceutically acceptable 
carrier or diluent. 



The viral system of the invention or viral particles of the invention may be used to treat
viral infections, particularly retroviral infections such as lentiviral infections including HIV
infections. Thus the present invention provides a method of treating a viral infection which
method comprises administering to a human or animal patient suffering from the viral
infection an effective amount of a viral system, viral particle or pharmaceutical
composition of the present invention.


The invention relates in particular to HIV-based vectors carrying anti-HIV ribozymes.
However, the invention can be applied to any other virus, in particular any other lentivirus,
for which treatment by gene therapy may be desirable. The invention is illustrated herein
for HIV, but this is not considered to limit the scope of the invention to HIV-based
anti-HIV vectors.



Detailed Description of the Invention 



The term "viral vector" refers to a nucleotide construct comprising a viral genome capable
of being transcribed in a host cell, which genome comprises sufficient viral genetic
information to allow packaging of the viral RNA genome, in the presence of packaging
components, into a viral particle capable of infecting a target cell. Infection of the target
cell includes reverse transcription and integration into the target cell genome, where
appropriate for particular viruses. The viral vector in use typically carries heterologous
coding sequences (nucleotides of interest) which are to be delivered by the vector to the
target cell, for example a first nucleotide sequence encoding a ribozyme. A viral vector is
incapable of independent replication to produce infectious viral particles within the final
target cell.



The term "viral vector system" is intended to mean a kit of parts which can be used when
combined with other necessary components for viral particle production to produce viral
particles in host cells. For example, the first nucleotide sequence may typically be present
in a plasmid vector construct suitable for cloning the first nucleotide sequence into a viral
genome vector construct. When combined in a kit with a third nucleotide sequence, which
will also typically be present in a separate plasmid vector construct, the resulting
combination of plasmid containing the first nucleotide sequence and plasmid containing
the third nucleotide sequence comprises the essential elements of the invention. Such a kit
may then be used by the skilled person in the production of suitable viral vector genome
constructs which, when transfected into a host cell together with the plasmid containing the
third nucleotide sequence, and optionally nucleic acid constructs encoding other
components required for viral assembly, will lead to the production of infectious viral
particles.



Alternatively, the third nucleotide sequence may be stably present within a packaging cell 
line that is included in the kit. 



The kit may include the other components needed to produce viral particles, such as host
cells and other plasmids encoding essential viral polypeptides required for viral assembly.
By way of example, the kit may contain (i) a plasmid containing a first nucleotide sequence
encoding an anti-HIV ribozyme and (ii) a plasmid containing a third nucleotide sequence
encoding a modified HIV gag.pol construct which cannot be cleaved by the anti-HIV
ribozyme. Optional components would then be (a) an HIV viral genome construct with
suitable restriction enzyme recognition sites for cloning the first nucleotide sequence into
the viral genome; and (b) a plasmid encoding a VSV-G env protein. Alternatively, nucleotide
sequences encoding viral polypeptides required for assembly of viral particles may be
provided in the kit as packaging cell lines comprising the nucleotide sequences, for
example a VSV-G expressing cell line.

The term "viral vector production system" refers to the viral vector system described above
wherein the first nucleotide sequence has already been inserted into a suitable viral vector
genome.



Viral vectors are typically retroviral vectors, in particular lentiviral vectors such as HIV
vectors. The retroviral vector of the present invention may be derived from or may be
derivable from any suitable retrovirus. A large number of different retroviruses have been
identified. Examples include: murine leukemia virus (MLV), human immunodeficiency
virus (HIV), simian immunodeficiency virus, human T-cell leukemia virus (HTLV),
equine infectious anaemia virus (EIAV), mouse mammary tumour virus (MMTV), Rous
sarcoma virus (RSV), Fujinami sarcoma virus (FuSV), Moloney murine leukemia virus
(Mo-MLV), FBR murine osteosarcoma virus (FBR MSV), Moloney murine sarcoma virus
(Mo-MSV), Abelson murine leukemia virus (A-MLV), Avian myelocytomatosis virus-29
(MC29), and Avian erythroblastosis virus (AEV). A detailed list of retroviruses may be
found in Coffin et al., 1997, "Retroviruses", Cold Spring Harbor Laboratory Press, Eds:
JM Coffin, SM Hughes, HE Varmus, pp 758-763.



Details on the genomic structure of some retroviruses may be found in the art. By way of
example, details on HIV and Mo-MLV may be found from the NCBI Genbank (Genome
Accession Nos. AF033819 and AF033811, respectively).


The lentivirus group can be split even further into "primate" and "non-primate". Examples
of primate lentiviruses include human immunodeficiency virus (HIV), the causative agent
of acquired immunodeficiency syndrome (AIDS), and simian immunodeficiency virus
(SIV). The non-primate lentiviral group includes the prototype "slow virus" visna/maedi
virus (VMV), as well as the related caprine arthritis-encephalitis virus (CAEV), equine
infectious anaemia virus (EIAV) and the more recently described feline immunodeficiency
virus (FIV) and bovine immunodeficiency virus (BIV).

The basic structure of a retrovirus genome is a 5' LTR and a 3' LTR, between or within
which are located a packaging signal to enable the genome to be packaged, a primer
binding site, integration sites to enable integration into a host cell genome and gag, pol and
env genes encoding the packaging components - these are polypeptides required for the
assembly of viral particles. More complex retroviruses have additional features, such as
rev and RRE sequences in HIV, which enable the efficient export of RNA transcripts of the
integrated provirus from the nucleus to the cytoplasm of an infected target cell.

In the provirus, these genes are flanked at both ends by regions called long terminal repeats
(LTRs). The LTRs are responsible for proviral integration, and transcription. LTRs also 
serve as enhancer-promoter sequences and can control the expression of the viral genes. 
Encapsidation of the retroviral RNAs occurs by virtue of a psi sequence located at the 5' 
end of the viral genome. 


The LTRs themselves are identical sequences that can be divided into three elements, 
which are called U3, R and U5. U3 is derived from the sequence unique to the 3' end of 
the RNA. R is derived from a sequence repeated at both ends of the RNA and U5 is 
derived from the sequence unique to the 5' end of the RNA. The sizes of the three 
elements can vary considerably among different retroviruses.

In a defective retroviral vector genome gag, pol and env may be absent or not functional. 
The R regions at both ends of the RNA are repeated sequences. U5 and U3 represent 
unique sequences at the 5' and 3' ends of the RNA genome respectively. 


In a typical retroviral vector for use in gene therapy, at least part of one or more of the gag,
pol and env protein coding regions essential for replication may be removed from the virus.
This makes the retroviral vector replication-defective. The removed portions may even be
replaced by a nucleotide sequence of interest (NOI), such as a first nucleotide sequence of
the invention, to generate a virus capable of integrating its genome into a host genome but
wherein the modified viral genome is unable to propagate itself due to a lack of structural
proteins. When integrated in the host genome, expression of the NOI occurs - resulting in,
for example, a therapeutic and/or a diagnostic effect. Thus, the transfer of an NOI into a
site of interest is typically achieved by: integrating the NOI into the recombinant viral
vector; packaging the modified viral vector into a virion coat; and allowing transduction of
a site of interest - such as a targeted cell or a targeted cell population.


A minimal retroviral genome for use in the present invention will therefore comprise (5') R
- U5 - one or more first nucleotide sequences - U3 - R (3'). However, the plasmid vector
used to produce the retroviral genome within a host cell/packaging cell will also include
transcriptional regulatory control sequences operably linked to the retroviral genome to
direct transcription of the genome in a host cell/packaging cell. These regulatory
sequences may be the natural sequences associated with the transcribed retroviral sequence,
i.e. the 5' U3 region, or they may be a heterologous promoter such as another viral
promoter, for example the CMV promoter.



Some retroviral genomes require additional sequences for efficient virus production. For
example, in the case of HIV, rev and RRE sequences are preferably included. However, the
requirement for rev and RRE can be reduced or eliminated by codon optimisation.



Once the retroviral vector genome is integrated into the genome of its target cell as proviral
DNA, the ribozyme sequences need to be expressed. In a retrovirus, the promoter is
located in the 5' LTR U3 region of the provirus. In retroviral vectors, the promoter driving
expression of a therapeutic gene may be the native retroviral promoter in the 5' U3 region,
or an alternative promoter engineered into the vector. The alternative promoter may
physically replace the 5' U3 promoter native to the retrovirus, or it may be incorporated at
a different place within the vector genome such as between the LTRs.



Thus, the first nucleotide sequence will also be operably linked to a transcriptional
regulatory control sequence to allow transcription of the first nucleotide sequence to occur
in the target cell. The control sequence will typically be active in mammalian cells. The
control sequence may, for example, be a viral promoter such as the natural viral promoter
or a CMV promoter or it may be a mammalian promoter. It is particularly preferred to use
a promoter that is preferentially active in the particular cell type or tissue type that the
virus to be treated primarily infects. Thus, in one embodiment, tissue-specific regulatory
sequences may be used. The regulatory control sequences driving expression of the one or
more first nucleotide sequences may be constitutive or regulated promoters.

Replication-defective retroviral vectors are typically propagated, for example to prepare
suitable titres of the retroviral vector for subsequent transduction, by using a combination 
of a packaging or helper cell line and the recombinant vector. That is to say, that the three 
packaging proteins can be provided in trans. 

A "packaging cell line" contains one or more of the retroviral gag, pol and env genes. The
packaging cell line produces the proteins required for packaging retroviral DNA but it
cannot bring about encapsidation due to the lack of a psi region. However, when a
recombinant vector carrying an NOI and a psi region is introduced into the packaging cell
line, the helper proteins can package the psi-positive recombinant vector to produce the
recombinant virus stock. This virus stock can be used to transduce cells to introduce the
NOI into the genome of the target cells. It is preferred to use a psi packaging signal, called
psi plus, that contains additional sequences spanning from upstream of the splice donor to
downstream of the gag start codon (Bender et al. (46)) since this has been shown to
increase viral titres.


The recombinant virus whose genome lacks all genes required to make viral proteins can
transduce only once and cannot propagate. These viral vectors, which are only capable of a
single round of transduction of target cells, are known as replication defective vectors.
Hence, the NOI is introduced into the host/target cell genome without the generation of
potentially harmful retrovirus. A summary of the available packaging lines is presented in
Coffin et al., 1997 (ibid).

Retroviral packaging cell lines in which the gag, pol and env viral coding regions are
carried on separate expression plasmids that are independently transfected into a packaging
cell line are preferably used. This strategy, sometimes referred to as the three plasmid
transfection method (Soneoka et al. (33)), reduces the potential for production of a
replication-competent virus since three recombinant events are required for wild type viral
production. As recombination is greatly facilitated by homology, reducing or eliminating
homology between the genomes of the vector and the helper can also be used to reduce the
problem of replication-competent helper virus production.

An alternative to stably transfected packaging cell lines is to use transiently transfected cell
lines. Transient transfections may advantageously be used to measure levels of vector
production when vectors are being developed. In this regard, transient transfection avoids
the longer time required to generate stable vector-producing cell lines and may also be used
if the vector or retroviral packaging components are toxic to cells. Components typically
used to generate retroviral vectors include a plasmid encoding the gag/pol proteins, a
plasmid encoding the env protein and a plasmid containing an NOI. Vector production
involves transient transfection of one or more of these components into cells containing the
other required components. If the vector encodes toxic genes or genes that interfere with
the replication of the host cell, such as inhibitors of the cell cycle or genes that induce
apoptosis, it may be difficult to generate stable vector-producing cell lines, but transient
transfection can be used to produce the vector before the cells die. Also, cell lines have
been developed using transient transfection that produce vector titre levels that are
comparable to the levels obtained from stable vector-producing cell lines (Pear et al. (47)).

Producer cells/packaging cells can be of any suitable cell type. Most commonly,
mammalian producer cells are used but other cells, such as insect cells, are not excluded.
Clearly, the producer cells will need to be capable of efficiently translating the env and
gag, pol mRNA. Many suitable producer/packaging cell lines are known in the art. The
skilled person is also capable of making suitable packaging cell lines by, for example,
stably introducing a nucleotide construct encoding a packaging component into a cell line.

As will be discussed below, where the retroviral genome encodes an inhibitory RNA
molecule capable of effecting the cleavage of gag, pol and/or env RNA transcripts, the
nucleotide sequences present in the packaging cell line, either integrated or carried on
plasmids, or in the transiently transfected producer cell line, which encode gag, pol and/or
env proteins will be modified so as to reduce or prevent binding of the inhibitory RNA
molecule(s). In this way, the inhibitory RNA molecule(s) will not prevent expression of
components in packaging cell lines that are essential for packaging of viral particles.

It is highly desirable to use high-titre virus preparations in both experimental and practical
applications. Techniques for increasing viral titre include using a psi plus packaging signal
as discussed above and concentration of viral stocks. In addition, the use of different
envelope proteins, such as the G protein from vesicular stomatitis virus, has improved titres
following concentration to 10^9 per ml (Cosset et al. (48)). However, typically the envelope
protein will be chosen such that the viral particle will preferentially infect cells that are
infected with the virus which it is desired to treat. For example, where an HIV vector is being
used to treat HIV infection, the env protein used will be the HIV env protein.

Suitable first nucleotide sequences for use according to the present invention encode gene
products that result in the cleavage and/or enzymatic degradation of a target nucleotide
sequence, which will generally be a ribonucleotide. As particular examples, ribozymes
and antisense sequences may be mentioned.

Ribozymes are RNA enzymes which cleave RNA at specific sites. Ribozymes can be
engineered so as to be specific for any chosen sequence containing a ribozyme cleavage
site. Thus, ribozymes can be engineered which have chosen recognition sites in transcribed
viral sequences. By way of an example, ribozymes encoded by the first nucleotide
sequence recognise and cleave essential elements of viral genomes required for the
production of viral particles, such as packaging components. Thus, for retroviral genomes,
such essential elements include the gag, pol and env gene products. A suitable ribozyme
capable of recognising at least one of the gag, pol and env gene sequences, or more
typically, the RNA sequences transcribed from these genes, is able to bind to and cleave
such a sequence. This will reduce or prevent production of the gag, pol or env protein as
appropriate and thus reduce or prevent the production of retroviral particles.

Ribozymes come in several forms, including hammerhead, hairpin and hepatitis delta
antigenomic ribozymes. Preferred for use herein are hammerhead ribozymes, in part
because of their relatively small size, because the sequence requirements for their target
cleavage site are minimal and because they have been well characterised. The ribozymes
most commonly used in research at present are hammerhead and hairpin ribozymes.

Each individual ribozyme has a motif which recognises and binds to a recognition site in
the target RNA. This motif takes the form of one or more "binding arms", generally two
binding arms. The binding arms in hammerhead ribozymes are the flanking sequences
Helix I and Helix III, which flank Helix II. These can be of variable length, usually
between 6 and 10 nucleotides each, but can be shorter or longer. The length of the flanking
sequences can affect the rate of cleavage. For example, it has been found that reducing the
total number of nucleotides in the flanking sequences from 20 to 12 can increase the
turnover rate of the ribozyme cleaving an HIV sequence 10-fold (44). A catalytic motif
in the ribozyme, Helix II in hammerhead ribozymes, cleaves the target RNA at a site which
is referred to as the cleavage site. Whether or not a ribozyme will cleave any given RNA is
determined by the presence or absence of a recognition site for the ribozyme containing an
appropriate cleavage site.


Each type of ribozyme recognises its own cleavage site. The hammerhead ribozyme
cleavage site has the nucleotide base triplet GUX directly upstream where G is guanine, U
is uracil and X is any nucleotide base. Hairpin ribozymes have a cleavage site of
BCUGNYR, where B is any nucleotide base other than adenine, N is any nucleotide, Y is
cytosine or thymine and R is guanine or adenine. Cleavage by hairpin ribozymes takes
place between the G and the N in the cleavage site.
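
The two cleavage-site rules above can be illustrated with a short scanning routine. This is only a sketch under stated assumptions: the function names and the test sequences are hypothetical, the input is treated as a plain RNA string, and the pyrimidine Y is matched as C or U because the target is RNA (the text lists thymine). It reports candidate motifs only and says nothing about whether a given ribozyme would actually bind or cleave there.

```python
import re


def hammerhead_sites(rna: str) -> list[int]:
    """Return 0-based indices of the nucleotide immediately 3' of each GUX triplet.

    The triplet lies directly upstream of the cut, so the reported index is
    the first nucleotide after the triplet (equal to len(rna) if the triplet
    ends the sequence).
    """
    rna = rna.upper().replace("T", "U")
    # A lookahead is used so that overlapping triplets are all reported.
    return [m.start() + 3 for m in re.finditer(r"(?=GU[ACGU])", rna)]


def hairpin_sites(rna: str) -> list[int]:
    """Return 0-based indices of N in each BCUGNYR motif (cut is between G and N).

    B = not A, N = any base, Y = C or U (assumption: uracil stands in for the
    thymine named in the text), R = A or G.
    """
    rna = rna.upper().replace("T", "U")
    pattern = r"(?=[CGU]CUG[ACGU][CU][AG])"
    return [m.start() + 4 for m in re.finditer(pattern, rna)]


if __name__ == "__main__":
    print(hammerhead_sites("AAGUCAA"))   # one GUC triplet
    print(hairpin_sites("ACCUGACGA"))    # one CCUGACG motif
```

Returning the position of the cut rather than the start of the motif keeps the two functions comparable, since both then describe where the target RNA would be split.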



The nucleic acid sequences encoding the packaging components (the "third nucleotide
sequences") may be resistant to the ribozyme or ribozymes because they lack any cleavage
sites for the ribozyme or ribozymes. This prohibits enzymatic activity by the ribozyme or
ribozymes and therefore there is no effective recognition site for the ribozyme or
ribozymes. Alternatively or additionally, the potential recognition sites may be altered in
the flanking sequences which form the part of the recognition site to which the ribozyme
binds. This either eliminates binding of the ribozyme motif to the recognition site, or
reduces binding capability enough to destabilise any ribozyme-target complex and thus
reduce the specificity and catalytic activity of the ribozyme. Where the flanking sequences
only are altered, they are preferably altered such that catalytic activity of the ribozyme at
the altered target sequence is negligible and is effectively eliminated.
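
One way to picture how a coding sequence can lose cleavage sites while still encoding the same polypeptide is synonymous codon substitution. The sketch below is illustrative only and is not the method of the invention: the greedy strategy, the function names and the nine-nucleotide test sequence are all assumptions. It uses the standard genetic code and counts every GU dinucleotide followed by another base as a potential GUX site.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in UCAG order (UUU, UUC, UUA, UUG, UCU, ...).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acid -> list of codons that encode it.
SYNONYMS: dict[str, list[str]] = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(rna: str) -> str:
    """Translate an in-frame RNA sequence (length divisible by 3)."""
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def count_gux(rna: str) -> int:
    """Count GUX triplets, i.e. GU followed by at least one more base."""
    return sum(1 for i in range(len(rna) - 2) if rna[i:i + 2] == "GU")


def recode(rna: str) -> str:
    """Greedily swap codons for synonyms whenever that lowers the GUX count."""
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    for idx, codon in enumerate(codons):
        best, best_score = codon, count_gux("".join(codons))
        for alt in SYNONYMS[CODON_TABLE[codon]]:
            trial = codons[:idx] + [alt] + codons[idx + 1:]
            score = count_gux("".join(trial))
            if score < best_score:
                best, best_score = alt, score
        codons[idx] = best
    return "".join(codons)


if __name__ == "__main__":
    original = "AGUCGUGGU"  # hypothetical fragment encoding Ser-Arg-Gly
    recoded = recode(original)
    print(translate(original), count_gux(original))
    print(translate(recoded), count_gux(recoded))
```

Not every site can be removed this way. All four valine codons begin with GU, so a valine followed by any further codon always leaves a GUX triplet; for such positions, altering the flanking sequences to which the ribozyme binds, as described above, is the remaining option.
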

Preferably, a series of several anti-HIV ribozymes is employed in the invention (5, 7, 10,
13, 21, 36, 38, 40). These can be any anti-HIV ribozymes but must include one or more
which cleave the RNA that is required for the expression of gag, pol or env. Preferably, a
plurality of ribozymes is employed, together capable of cleaving gag, pol and env RNA of
the native retrovirus at a plurality of sites. Since HIV exists as a population of
quasispecies, not all of the target sequences for the ribozymes will be included in all HIV
variants. The problem presented by this variability can be overcome by using multiple
ribozymes. Multiple ribozymes can be included in series in a single vector and can
function independently when expressed as a single RNA sequence. A single RNA
containing two or more ribozymes having different target recognition sites may be referred
to as a multitarget ribozyme. The placement of ribozymes in series has been demonstrated
to enhance cleavage. The use of a plurality of ribozymes is not limited to treating HIV
infection but may be used in relation to other viruses, retroviruses or otherwise.

Antisense technology is well known in the art. There are various mechanisms by which
antisense sequences are believed to inhibit gene expression. One mechanism by which
antisense sequences are believed to function is the recruitment of the cellular protein
RNAseH to the target sequence/antisense construct heteroduplex, which results in cleavage
and degradation of the heteroduplex. Thus the antisense construct, by contrast to
ribozymes, can be said to lead indirectly to cleavage/degradation of the target sequence.
Thus according to the present invention, a first nucleotide sequence may encode an
antisense RNA that binds to either a gene encoding an essential/packaging component or
the RNA transcribed from said gene such that expression of the gene is inhibited, for
example as a result of RNAseH degradation of a resulting heteroduplex. It is not necessary
for the antisense construct to encode the entire complementary sequence of the gene
encoding an essential/packaging component; a portion may suffice. The skilled person
will easily be able to determine how to design a suitable antisense construct.


By contrast, the nucleic acid sequences encoding the essential/packaging components of
the viral particles required for the assembly of viral particles in the host cells/producer
cells/packaging cells (the third nucleotide sequences) are resistant to the inhibitory RNA
molecules encoded by the first nucleotide sequence. For example in the case of ribozymes,
resistance is typically by virtue of alterations in the sequences which eliminate the
ribozyme recognition sites. At the same time, the amino acid coding sequence for the
essential/packaging components is retained so that the viral components encoded by the
sequences remain the same, or at least sufficiently similar that the function of the
essential/packaging components is not compromised.

The term "viral polypeptide required for the assembly of viral particles" means a
polypeptide normally encoded by the viral genome to be packaged into viral particles, in
the absence of which the viral genome cannot be packaged. For example, in the context of
retroviruses such polypeptides would include gag, pol and env. The terms "packaging
component" and "essential component" are also included within this definition.

In the case of antisense sequences, the third nucleotide sequence differs from the second
nucleotide sequence encoding the target viral packaging component to the extent that,
although the antisense sequence can bind to the second nucleotide sequence,
or transcript thereof, the antisense sequence cannot bind effectively to the third nucleotide
sequence or RNA transcribed therefrom. The changes between the second and third
nucleotide sequences will typically be conservative changes, although a small number of
amino acid changes may be tolerated provided that, as described above, the function of the
essential/packaging components is not significantly impaired.

Preferably, in addition to eliminating the ribozyme recognition sites, the alterations to the
coding sequences for the viral components improve the sequences for codon usage in the
mammalian cells or other cells which are to act as the producer cells for retroviral vector
particle production. This improvement in codon usage is referred to as "codon
optimisation". Many viruses, including HIV and other lentiviruses, use a large number of
rare codons and by changing these to correspond to commonly used mammalian codons,
increased expression of the packaging components in mammalian producer cells can be
achieved. Codon usage tables are known in the art for mammalian cells, as well as for a
variety of other organisms.

Thus preferably, the sequences encoding the packaging components are codon optimised.
More preferably, the sequences are codon optimised in their entirety. Following codon
optimisation, it is found that there are numerous sites in the wild type gag, pol and env
sequences which can serve as ribozyme recognition sites and which are no longer present
in the sequences encoding the packaging components. In an alternative but less practical
strategy, the sequences encoding the packaging components can be altered by targeted
conservative alterations so as to render them resistant to selected ribozymes capable of
cleaving the wild type sequences.
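
The effect of codon optimisation on candidate ribozyme sites can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used to design the synthetic sequences: the `PREFERRED` table simply takes the most frequent codon for each amino acid in the Figure 4 codon usage table as a stand-in for "commonly used mammalian codons", and the helper names are hypothetical. The input fragment is the first 30 nucleotides of the wild type gagpol sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumption: one preferred codon per amino acid, taken as the most
# frequent codon for that amino acid in the Figure 4 usage table.
PREFERRED = {
    "F": "TTC", "L": "CTG", "I": "ATC", "M": "ATG", "V": "GTG",
    "S": "AGC", "P": "CCC", "T": "ACC", "A": "GCC", "Y": "TAC",
    "H": "CAC", "Q": "CAG", "N": "AAC", "K": "AAG", "D": "GAC",
    "E": "GAG", "C": "TGC", "W": "TGG", "R": "CGC", "G": "GGC",
    "*": "TAG",
}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def recode(dna):
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


def count_gtn(dna):
    """Count GTN triplets (GUX in the transcript) in a DNA sequence."""
    return sum(1 for i in range(len(dna) - 2) if dna[i:i + 2] == "GT")


wild_type = "ATGGGTGCGAGAGCGTCAGTATTAAGCGGG"
optimised = recode(wild_type)
```

Here `translate(optimised)` equals `translate(wild_type)`, so the encoded polypeptide is unchanged while the number of GTN triplets differs. Recoding can also create new GTN triplets, which is why a designed ribozyme target, including its flanking sequence, should be checked against the final optimised sequence rather than relying on triplet counts alone.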


An additional advantage of codon optimising HIV packaging components is that this can
increase gene expression. In particular, it can render gag, pol expression Rev independent
so that rev and RRE need not be included in the genome (11). Rev-independent vectors are
therefore possible. This in turn enables the use of anti-rev or RRE factors in the retroviral
vector.

As described above, the packaging components for a retroviral vector include expression
products of gag, pol and env genes. In accordance with the present invention, gag and pol
employed in the packaging system are derived from the target retrovirus on which the
vector genome is based. Thus, in the RNA transcript form, gag and pol would normally be
cleavable by the ribozymes present in the vector genome. The env gene employed in the
packaging system may be derived from a different virus, including other retroviruses such
as MLV and non-retroviruses such as VSV (a Rhabdovirus), in which case it may not need
any sequence alteration to render it resistant to ribozyme cleavage. Alternatively, env may
be derived from the same retrovirus as gag and pol, in which case any recognition sites for
the ribozymes will need to be eliminated by sequence alteration.

The process of producing a retroviral vector in which the envelope protein is not the native
envelope of the retrovirus is known as "pseudotyping". Certain envelope proteins, such as
MLV envelope protein and vesicular stomatitis virus G (VSV-G) protein, pseudotype
retroviruses very well. Pseudotyping can be useful for altering the target cell range of the
retrovirus. Alternatively, to maintain target cell specificity for target cells infected with the
particular virus it is desired to treat, the envelope protein may be the same as that of the
target virus, for example HIV.

Other therapeutic coding sequences may be present along with the first nucleotide
sequence or sequences. Other therapeutic coding sequences include, but are not limited to,
sequences encoding cytokines, hormones, antibodies, immunoglobulin fusion proteins,
enzymes, immune co-stimulatory molecules, anti-sense RNA, a transdominant negative
mutant of a target protein, a toxin, a conditional toxin, an antigen, a single chain antibody,
tumour suppressor proteins and growth factors. When included, such coding sequences are
operatively linked to a suitable promoter, which may be the promoter driving expression of
the first nucleotide sequence or a different promoter or promoters.



Thus the invention comprises two components. The first is a genome construction that will
be packaged by viral packaging components and which carries a series of anti-viral
inhibitory RNA molecules such as anti-HIV ribozymes (5, 7, 10, 13, 21, 36, 38, 40). These
could be any anti-HIV ribozymes but the key issue for this invention is that some of them
cleave RNA that is required for the expression of native or wild type HIV gag, pol or env
coding sequences. The second component is the packaging system, which comprises a
cassette for the expression of HIV gag, pol and a cassette either for HIV env or an envelope
gene encoding a pseudotyping envelope protein, the packaging system being resistant to the
inhibitory RNA molecules.



The viral particles of the present invention, and the viral vector system and methods used
to produce them, may thus be used to treat or prevent viral infections, preferably retroviral
infections, in particular lentiviral, especially HIV, infections. Specifically, the viral
particles of the invention, typically produced using the viral vector system of the present
invention, may be used to deliver inhibitory RNA molecules to a human or animal in need
of treatment for a viral infection.



Alternatively, or in addition, the viral production system may be used to transfect cells
obtained from a patient ex vivo, which are then returned to the patient. Patient cells
transfected ex vivo may be formulated as a pharmaceutical composition (see below) prior to
being returned to the patient.

Preferably the viral particles are combined with a pharmaceutically acceptable carrier or
diluent to produce a pharmaceutical composition. Thus, the present invention also provides
a pharmaceutical composition for treating an individual, wherein the composition
comprises a therapeutically effective amount of the viral particle of the present invention,
together with a pharmaceutically acceptable carrier, diluent, excipient or adjuvant. The
pharmaceutical composition may be for human or animal usage.

The choice of pharmaceutical carrier, excipient or diluent can be made with regard to the
intended route of administration and standard pharmaceutical practice. Suitable carriers and
diluents include isotonic saline solutions, for example phosphate-buffered saline. The
pharmaceutical compositions may comprise as - or in addition to - the carrier, excipient or
diluent any suitable binder(s), lubricant(s), suspending agent(s), coating agent(s),
solubilising agent(s), and other carrier agents that may aid or increase the viral entry into
the target site (such as for example a lipid delivery system).

The pharmaceutical composition may be formulated for parenteral, intramuscular, 
intravenous, intracranial, subcutaneous, intraocular or transdermal administration. 


Where appropriate, the pharmaceutical compositions can be administered by any one or
more of: inhalation, in the form of a suppository or pessary, topically in the form of a
lotion, solution, cream, ointment or dusting powder, by use of a skin patch, orally in the
form of tablets containing excipients such as starch or lactose, or in capsules or ovules
either alone or in admixture with excipients, or in the form of elixirs, solutions or
suspensions containing flavouring or colouring agents, or they can be injected parenterally,
for example intracavernosally, intravenously, intramuscularly or subcutaneously. For
parenteral administration, the compositions may be best used in the form of a sterile
aqueous solution which may contain other substances, for example enough salts or
monosaccharides to make the solution isotonic with blood. For buccal or sublingual
administration the compositions may be administered in the form of tablets or lozenges
which can be formulated in a conventional manner.

The amount of virus administered is typically in the range of from 10^3 to 10^10 pfu,
preferably from 10^5 to 10^8 pfu, more preferably from 10^6 to 10^7 pfu. When injected,
typically 1-10 µl of virus in a pharmaceutically acceptable suitable carrier or diluent is
administered.

When the polynucleotide/vector is administered as a naked nucleic acid, the amount of
nucleic acid administered is typically in the range of from 1 µg to 10 mg, preferably from
100 µg to 1 mg.


Where the first nucleotide sequence (or other therapeutic sequence) is under the control of
an inducible regulatory sequence, it may only be necessary to induce gene expression for
the duration of the treatment. Once the condition has been treated, the inducer is removed
and expression of the NOI is stopped. This will clearly have clinical advantages. Such a
system may, for example, involve administering the antibiotic tetracycline, to activate gene
expression via its effect on the tet repressor/VP16 fusion protein.

The invention will now be further described by way of Examples, which are meant to serve
to assist one of ordinary skill in the art in carrying out the invention and are not intended in
any way to limit the scope of the invention. The Examples refer to the Figures. In the
Figures:

Figure 1 shows schematically ribozymes inserted into four different HIV vectors;

Figure 2 shows schematically how to create a suitable 3' LTR by PCR;

Figure 3 shows the codon usage table for wild type HIV gag,pol of strain HXB2 (accession
number: K03455).

Figure 4 shows the codon usage table of the codon optimised sequence designated
gag,pol-SYNgp.

Figure 5 shows the codon usage table of the wild type HIV env called env-mn.

Figure 6 shows the codon usage table of the codon optimised sequence of HIV env
designated SYNgp160mn.

Figure 7 shows three plasmid constructs for use in the invention.

Figure 8 shows the principle behind two systems for producing retroviral vector particles.
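
Codon usage tables of the kind shown in Figures 3 to 6 can be tallied directly from a coding sequence. The following Python sketch is illustrative only; the function name `codon_usage` is hypothetical, and the example input is the first 60 nucleotides of the wild type gagpol sequence of SEQ. ID. NO. 1, written with the blank-separated blocks of ten used in the sequence listing.

```python
from collections import Counter


def codon_usage(dna):
    """Count in-frame codons in a DNA coding sequence.

    Whitespace (such as the blocks of ten used in sequence listings)
    is removed before counting.
    """
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return Counter(dna[i:i + 3] for i in range(0, len(dna), 3))


fragment = "ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ATCGATGGGA AAAAATTCGG"
usage = codon_usage(fragment)
```

Applied to a full-length coding sequence, the resulting counts can be laid out in the 4 x 16 grid used in the Figures and compared between wild type and codon optimised versions.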


The invention will now be further described in the Examples which follow, which are 
intended as an illustration only and do not limit the scope of the invention. 

EXAMPLES 


Example 1 - Construction of a Genome 

The HIV gag,pol sequence was codon optimised (Figure 4 and SEQ I.D. No. 2) and
synthesised using overlapping oligos of around 40 nucleotides. This has three advantages.
Firstly it allows an HIV based vector to carry ribozymes and other therapeutic factors.
Secondly the codon optimisation generates a higher vector titre due to a higher level of
gene expression. Thirdly gag,pol expression becomes rev independent which allows the
use of anti-rev or RRE factors.
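
The idea of covering a synthetic gene with overlapping oligos can be sketched as follows. This is a simplified illustration, not the synthesis scheme actually used: the 20-nucleotide overlap, the function names and the use of top-strand oligos only are assumptions (practical gene synthesis normally alternates strands), and only the oligo length of around 40 nucleotides comes from the text.

```python
def tile_oligos(seq, length=40, overlap=20):
    """Split seq into top-strand oligos of the given length.

    Consecutive oligos share `overlap` nucleotides. The last oligo may be
    shorter than `length` but is always longer than `overlap`.
    """
    if not 0 < overlap < length:
        raise ValueError("overlap must be between 0 and length")
    step = length - overlap
    oligos = []
    for start in range(0, len(seq), step):
        oligos.append(seq[start:start + length])
        if start + length >= len(seq):
            break
    return oligos


def reassemble(oligos, overlap=20):
    """Join oligos back together, checking every overlap on the way."""
    seq = oligos[0]
    for oligo in oligos[1:]:
        if seq[-overlap:] != oligo[:overlap]:
            raise ValueError("adjacent oligos do not overlap as expected")
        seq += oligo[overlap:]
    return seq
```

With these assumed parameters a 4308 b.p. coding sequence is covered by 215 top-strand oligos, and `reassemble(tile_oligos(seq))` returns the original sequence.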

Conserved sequences within gag,pol were identified by reference to the HIV Sequence
database at Los Alamos National Laboratory (http://hiv-web.lanl.gov/) and used to design
ribozymes. Because of the variability between subtypes of HIV-1 the ribozymes were
designed to cleave the predominant subtype within North America, Latin America and the
Caribbean, Europe, Japan and Australia; that is subtype B. The sites chosen were
cross-referenced with the synthetic gagpol sequence to ensure that there was a low
possibility of cutting the codon optimised gagpol mRNA. The ribozymes were designed with
XhoI and SalI sites at the 5' and 3' end respectively. This allows the construction of
separate and tandem ribozymes.

The ribozymes are hammerhead (25) structures of the following general structure:

   Helix I         Helix II        Helix III
5'-NNNNNNNN-CUGAUGAGGCCGAAAGGCCGAA-NNNNNNNN-

The catalytic domain of the ribozyme (Helix II) can tolerate some changes without
reducing catalytic turnover.
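
The general structure above can be turned into a concrete ribozyme sequence for a given target site. The Python sketch below is illustrative and rests on assumptions not stated in the text: following the conventional hammerhead design, the Helix I arm is taken to pair with the eight target nucleotides 3' of the GUX triplet, the Helix III arm with the eight target nucleotides ending in the GU of the triplet, and the X nucleotide is left unpaired. The function name `design_hammerhead` is hypothetical; the catalytic domain is the Helix II sequence given above, and the example target is the GAG 2 cleavage site of this Example.

```python
CORE = "CUGAUGAGGCCGAAAGGCCGAA"  # Helix II catalytic domain from the text
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp(rna):
    """Reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def design_hammerhead(target, gu_index, arm=8):
    """Build a 5'->3' hammerhead sequence for a GUX site in `target`.

    gu_index is the 0-based index of the G of the GUX triplet.
    Assumed pairing: Helix I arm with the `arm` nucleotides after X,
    Helix III arm with the `arm` nucleotides ending in GU.
    """
    if target[gu_index:gu_index + 2] != "GU":
        raise ValueError("no GU at the given index")
    up_start = gu_index + 2 - arm
    upstream = target[max(up_start, 0):gu_index + 2]
    downstream = target[gu_index + 3:gu_index + 3 + arm]
    if up_start < 0 or len(downstream) != arm:
        raise ValueError("target too short for the requested arm length")
    return revcomp(downstream) + CORE + revcomp(upstream)


gag2_target = "AACCCAGAUUGUAAGACUAUUU"
ribozyme = design_hammerhead(gag2_target, gu_index=10)
```

The result is a 38-nucleotide sequence of the form NNNNNNNN-core-NNNNNNNN, to which the flanking restriction sites would be added for cloning.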



The cleavage sites, targeting gag and pol, with the essential GUX triplet (where X is any 
nucleotide base) are as follows: 



GAG 1   5' UAGUAAGAAUGUAUAGCCCUAC
GAG 2   5' AACCCAGAUUGUAAGACUAUUU
GAG 3   5' UGUUUCAAUUGUGGCAAAGAAG
GAG 4   5' AAAAAGGGCUGUUGGAAAUGUG
POL 1   5' ACGACCCCUCGUCACAAUAAAG
POL 2   5' GGAAUUGGAGGUUUUAUCAAAG
POL 3   5' AUAUUUUUCAGUUCCCUUAGAU
POL 4   5' UGGAUGAUUUGUAUGUAGGAUC
POL 5   5' CUUUGGAUGGGUUAUGAACUCC
POL 6   5' CAGCUGGACUGUCAAUGACAUA
POL 7   5' AACUUUCUAUGUAGAUGGGGCA
POL 8   5' AAGGCCGCCUGUUGGUGGGCAG
POL 9   5' UAAGACAGCAGUACAAAUGGCA
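
A quick consistency check on the cleavage sites listed above can be written in Python. The sketch assumes each site is a 22-nucleotide RNA with the GUX triplet at nucleotides 11 to 13 (0-based indices 10 to 12), which is how the sequences read once transcribed without spaces; the dictionary and function names are hypothetical.

```python
# The 22-nt target sites, written 5'->3' as RNA.
SITES = {
    "GAG 1": "UAGUAAGAAUGUAUAGCCCUAC",
    "GAG 2": "AACCCAGAUUGUAAGACUAUUU",
    "GAG 3": "UGUUUCAAUUGUGGCAAAGAAG",
    "GAG 4": "AAAAAGGGCUGUUGGAAAUGUG",
    "POL 1": "ACGACCCCUCGUCACAAUAAAG",
    "POL 2": "GGAAUUGGAGGUUUUAUCAAAG",
    "POL 3": "AUAUUUUUCAGUUCCCUUAGAU",
    "POL 4": "UGGAUGAUUUGUAUGUAGGAUC",
    "POL 5": "CUUUGGAUGGGUUAUGAACUCC",
    "POL 6": "CAGCUGGACUGUCAAUGACAUA",
    "POL 7": "AACUUUCUAUGUAGAUGGGGCA",
    "POL 8": "AAGGCCGCCUGUUGGUGGGCAG",
    "POL 9": "UAAGACAGCAGUACAAAUGGCA",
}


def central_triplet(site):
    """Return nucleotides 11-13 of a 22-nt site (the expected GUX)."""
    return site[10:13]


def check_sites(sites):
    """Map each site name to True if it is 22 nt long with a central GU."""
    return {
        name: len(seq) == 22 and central_triplet(seq).startswith("GU")
        for name, seq in sites.items()
    }
```

`check_sites(SITES)` flags any entry whose length or central triplet does not fit the assumed layout, which is a convenient guard against transcription errors in the list.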



The ribozymes are inserted into four different HIV vectors (pH4 (10), pH6, pH4.1, or
pH6.1) (Figure 1). In pH4 and pH6, transcription of the ribozymes is driven by an internal
HCMV promoter (9). From pH4.1 and pH6.1, the ribozymes are expressed from the 5'
LTR. The major difference between pH4 and pH6 (and pH4.1 and pH6.1) resides in the 3'
LTR in the production plasmid. pH4 and pH4.1 have the HIV U3 in the 3' LTR. pH6 and
pH6.1 have HCMV in the 3' LTR. The HCMV promoter replaces most of the U3 and will
drive expression at high constitutive levels while the HIV-1 U3 will support a high level of
expression only in the presence of Tat.

The HCMV/HIV-1 hybrid 3' LTR is created by recombinant PCR with three PCR primers
(Figure 2). The first round of PCR is performed with RIB1 and RIB2 using pH4 (12) as
the template to amplify the HIV-1 HXB2 sequence 8900-9123. The second round of PCR
makes the junction between the 5' end of the HIV-1 U3 and the HCMV promoter by
amplifying the hybrid 5' LTR from pH4. The PCR product from the first PCR reaction and
RIB3 serve as the 5' primer and 3' primer respectively.



RIB1   5'-CAGCTGCTCGAGCAGCTGAAGCTTGCATGC-3'
RIB2   5'-GTAAGTTATGTAACGGACGATATCTTGTCTTCTT-3'
RIB3   5'-CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC-3'



The PCR product is then cut with SphI and SalI and inserted into pH4 thereby replacing the
3' LTR. The resulting plasmid is designated pH6. To construct pH4.1 and pH6.1, the
internal HCMV promoter (SpeI - XhoI) in pH4 and pH6 is replaced with the polycloning
site of pBluescript II KS+ (Stratagene) (SpeI - XhoI).
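
The restriction sites carried by the three PCR primers can be located with a short Python sketch. The enzyme recognition sequences used are the standard ones for the enzymes named in this specification; the function names are hypothetical and the search covers only the primer strand as written.

```python
# Standard recognition sequences of enzymes named in the text.
ENZYMES = {
    "XhoI": "CTCGAG",
    "SalI": "GTCGAC",
    "SphI": "GCATGC",
    "SpeI": "ACTAGT",
    "EcoRI": "GAATTC",
    "NotI": "GCGGCCGC",
}

PRIMERS = {
    "RIB1": "CAGCTGCTCGAGCAGCTGAAGCTTGCATGC",
    "RIB2": "GTAAGTTATGTAACGGACGATATCTTGTCTTCTT",
    "RIB3": "CGCATAGTCGACGGGCCCGCCACTGCTAGAGATTTTC",
}


def find_all(seq, motif):
    """Return every 0-based start index of motif in seq."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def map_sites(primers, enzymes):
    """For each primer, list the enzymes whose site occurs and where."""
    result = {}
    for name, seq in primers.items():
        found = {}
        for enzyme, motif in enzymes.items():
            hits = find_all(seq, motif)
            if hits:
                found[enzyme] = hits
        result[name] = found
    return result


site_map = map_sites(PRIMERS, ENZYMES)
```

On these primers the search finds an SphI site in RIB1 and a SalI site in RIB3, in keeping with the SphI and SalI digestion of the PCR product described above.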

The ribozymes are inserted into the XhoI sites in the genome vector backbones. Any
ribozymes in any configuration could be used in a similar way.

Example 2 - Construction of a Packaging System 

The packaging system can take various forms. In a first form of packaging system, the
HIV gag, pol components are co-expressed with the HIV env coding sequence. In this
case, both the gag, pol and the env coding sequences are altered such that they are resistant
to the anti-HIV ribozymes that are built into the genome. At the same time as altering the
codon usage to achieve resistance, the codons can be chosen to match the usage pattern of
the most highly expressed mammalian genes. This dramatically increases expression
levels (28, 29) and so increases titre. A codon optimised HIV env coding sequence has
been described by Haas et al (9). In the present example, a modified codon optimised HIV
env sequence is used (SEQ I.D. No. 3). The corresponding env expression plasmid is
designated pSYNgp160mn. The modified sequence contains extra motifs not used by Haas
et al. The extra sequences were taken from the HIV env sequence of strain MN and codon
optimised. Any similar modification of the nucleic acid sequence would function similarly
as long as it used codons corresponding to abundant tRNAs (42) and led to resistance to
the ribozymes in the genome.



In one example of a gag, pol coding sequence with optimised codon usage, overlapping
oligonucleotides are synthesised and then ligated together to produce the synthetic coding
sequence. The sequences of the wild-type (Genbank accession no. K03455) and synthetic
(gagpol-SYNgp) gagpol are shown in SEQ I.D. Nos 1 and 2, respectively, and their
codon usage is shown in Figures 3 and 4, respectively. The sequence of a wild type env
coding sequence (Genbank Accession No. M17449) is given in SEQ I.D. No 3, the
sequence of a synthetic codon optimised sequence is given in SEQ. I.D. No. 4 and their
codon usage tables are given in Figures 5 and 6, respectively. As with the env coding
sequence any gag, pol sequence that achieves resistance to the ribozymes could be used.
The synthetic sequence shown is designated gag, pol-SYNgp and has an EcoRI site at the 5'
end and a NotI site at the 3' end. It is inserted into pCIneo (Promega) to produce plasmid
pSYNgp.



In a second form of the packaging system a synthetic gag, pol cassette is coexpressed with
a non-HIV envelope coding sequence that produces a surface protein that pseudotypes
HIV. This could be for example VSV-G (20, 41), amphotropic MLV env (6, 34) or any
other protein that would be incorporated into the HIV particle (37). This includes
molecules capable of targeting the vector to specific tissues. Coding sequences for
non-HIV envelope proteins are not cleaved by the ribozymes and so no sequence
modification is required (although some sequence modification may be desirable for other
reasons such as optimisation for codon usage in mammalian cells).

Vector particles can be produced either from a transient three-plasmid transfection system
similar to that described by Soneoka et al (33) or from producer cell lines similar to those
used for other retroviral vectors (20, 35, 39). These principles are illustrated in Figures 7
and 8. For example, by using pH6Rz, pSYNgp and pRV67 (VSV-G expression plasmid)
in a three plasmid transfection of 293T cells (Figure 8), as described by Soneoka et al (33),
vector particles designated H6Rz-VSV are produced. These transduce the H6Rz genome
to CD4+ cells such as C1866 or Jurkat and produce the multitarget ribozymes. HIV
replication in these cells is now severely restricted.

All publications mentioned in the above specification are herein incorporated by reference.
Various modifications and variations of the described methods and system of the invention
will be apparent to those skilled in the art without departing from the scope and spirit of the
invention. Although the invention has been described in connection with specific preferred
embodiments, it should be understood that the invention as claimed should not be unduly
limited to such specific embodiments. Indeed, various modifications of the described
modes for carrying out the invention which are obvious to those skilled in molecular
biology or related fields are intended to be within the scope of the following claims.

1. A viral vector system comprising:

(i) a first nucleotide sequence encoding a gene product capable of binding to and 
effecting the cleavage, directly or indirectly, of a second nucleotide sequence, or 
transcription product thereof, encoding a viral polypeptide required for the assembly of 
viral particles; and 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of viral particles, which third nucleotide sequence has a different nucleotide 
sequence to the second nucleotide sequence such that the third nucleotide sequence, or 
transcription product thereof, is resistant to cleavage directed by said gene product. 

2. A viral vector production system comprising: 

(i) a viral genome comprising at least one first nucleotide sequence encoding a gene 
product capable of binding to and effecting the cleavage, directly or indirectly, of a second 
nucleotide sequence, or transcription product thereof, encoding a viral polypeptide required 
for the assembly of viral particles; 

(ii) a third nucleotide sequence encoding said viral polypeptide required for the 
assembly of the viral genome into viral particles, which third nucleotide sequence has a 
different nucleotide sequence to the second nucleotide sequence such that said third 
nucleotide sequence, or transcription product thereof, is resistant to cleavage directed by 
said gene product. 

3. A system according to claim 1 or 2 wherein the gene product is selected from a 
ribozyme and an anti-sense ribonucleic acid. 

4. A system according to any one of claims 1 to 3 wherein the viral vector is a 
retroviral vector. 

5. A system according to claim 4 wherein the retroviral vector is a lentiviral vector. 

6. A system according to claim 5 wherein the lentiviral vector is an HIV vector.

7. A system according to any one of claims 4 to 6 wherein the polypeptide required
for the assembly of viral particles is selected from gag, pol and env proteins.

8. A system according to claim 7 wherein at least the gag and pol proteins are from a
lentivirus.

9. A system according to claim 7 wherein the env protein is from a lentivirus. 

10. A system according to claim 8 or 9 wherein the lentivirus is HIV. 

11. A system according to any one of the preceding claims wherein the third nucleotide
sequence is resistant to cleavage directed by the gene product as a result of one or more
conservative alterations in the nucleotide sequence which remove cleavage sites recognised
by the at least one gene product and/or binding sites for the at least one gene product.

12. A system according to any one of claims 1 to 10 wherein the third nucleotide 
sequence is adapted to be resistant to cleavage by the at least one gene product. 

13. A system according to any one of the preceding claims wherein the third nucleotide 
sequence is codon optimised for expression in producer cells. 

14. A system according to claim 13, wherein the producer cells are mammalian cells.

15. A system according to any one of the preceding claims comprising a plurality of 
first nucleotide sequences and third nucleotide sequences as defined therein. 

16. A viral particle comprising a viral vector genome as defined in any one of claims 2
to 15 and one or more third nucleotide sequences as defined in any of claims 2 to 15.

17. A viral particle produced using a viral vector production system according to any
one of claims 2 to 15.

18. A method for producing a viral particle which method comprises introducing into a
host cell (i) a viral genome as defined in any one of claims 2 to 15 (ii) one or more third
nucleotide sequences as defined in any of claims 2 to 15 and (iii) nucleotide sequences
encoding the other essential viral packaging components not encoded by the one or more
third nucleotide sequences.

19. A viral particle produced by the method of claim 18.

20. A pharmaceutical composition comprising a viral particle according to claims 16, 
17 or 19 together with a pharmaceutically acceptable carrier or diluent. 

21. A viral system according to any one of claims 1 to 16 or a viral particle according 
to claims 16, 17 or 19 in treating a viral infection. 

22. A viral system according to any one of claims 1 to 16 for use in a method of
producing viral particles.

Figure 3

gagpol-HXB2 -> Codon Usage
DNA sequence 4308 b.p. ATGGGTGCGAGA ... GATGAGGATTAG linear

MW : 161929 Dalton CAI(S.c) : 0.083 CAI(E.c) : 0.151

TTT phe F 21   TCT ser S  3   TAT tyr Y 30   TGT cys C
TTC phe F 14   TCC ser S  3   TAC tyr Y  9   TGC cys C
TTA leu L 46   TCA ser S 19   TAA OCH Z  -   TGA OPA Z
TTG leu L 11   TCG ser S  1   TAG AMB Z  1   TGG trp W
CTT leu L 13   CCT pro P 21   CAT his H 20   CGT arg R
CTC leu L  7   CCC pro P 14   CAC his H  7   CGC arg R
CTA leu L 17   CCA pro P 41   CAA gln Q 56   CGA arg R
CTG leu L 16   CCG pro P  -   CAG gln Q 39   CGG arg R
ATT ile I 30   ACT thr T 24   AAT asn N 42   AGT ser S
ATC ile I 14   ACC thr T 20   AAC asn N 16   AGC ser S
ATA ile I 56   ACA thr T 43   AAA lys K 88   AGA arg R
ATG met M 29   ACG thr T  1   AAG lys K 34   AGG arg R
GTT val V 15   GCT ala A 17   GAT asp D 37   GGT gly G
GTC val V 11   GCC ala A 19   GAC asp D 26   GGC gly G
GTA val V 55   GCA ala A 55   GAA glu E 75   GGA gly G
GTG val V 15   GCG ala A  5   GAG glu E 32   GGG gly G

Figure 4



gagpol-SYNgp [1 to 4308] -> Codon Usage
DNA sequence 4308 b.p. ATGGGCGCCCGC ... GATGAGGATTAG linear

1436 codons

MW : 161929 Dalton CAI(S.c) : 0.080 CAI(E.c) : 0.296

TTT phe F  5   TCT ser S  5   TAT tyr Y 10   TGT cys C  6
TTC phe F 30   TCC ser S 11   TAC tyr Y 29   TGC cys C 14
TTA leu L  2   TCA ser S  4   TAA OCH Z  -   TGA OPA Z  -
TTG leu L  7   TCG ser S  6   TAG AMB Z  1   TGG trp W 37
CTT leu L  3   CCT pro P 14   CAT his H  6   CGT arg R  2
CTC leu L 22   CCC pro P 39   CAC his H 21   CGC arg R 34
CTA leu L  6   CCA pro P 10   CAA gln Q 14   CGA arg R  3
CTG leu L 70   CCG pro P 13   CAG gln Q 81   CGG arg R 10
ATT ile I 17   ACT thr T 11   AAT asn N 13   AGT ser S  7
ATC ile I 79   ACC thr T 48   AAC asn N 45   AGC ser S 27
ATA ile I  4   ACA thr T 13   AAA lys K 25   AGA arg R  7
ATG met M 29   ACG thr T 16   AAG lys K 97   AGG arg R 13
GTT val V  5   GCT ala A 15   GAT asp D 19   GGT gly G 10
GTC val V 27   GCC ala A 56   GAC asp D 44   GGC gly G 54
GTA val V  6   GCA ala A 13   GAA glu E 29   GGA gly G 16
GTG val V 58   GCG ala A 12   GAG glu E 78   GGG gly G 28

Figure 5



env-mn [1 to 2571] -> Codon Usage
DNA sequence 2571 b.p. ATGAGAGTGAAG ... GCTTTGCTATAA linear

TTT phe F 13   TCT ser S  7   TAT tyr Y 15   TGT cys C
TTC phe F 11   TCC ser S  3   TAC tyr Y  7   TGC cys C
TTA leu L 20   TCA ser S 13   TAA OCH Z  1   TGA OPA Z
TTG leu L 17   TCG ser S  2   TAG AMB Z  -   TGG trp W
CTT leu L  9   CCT pro P  5   CAT his H  8   CGT arg R
CTC leu L 11   CCC pro P  9   CAC his H  6   CGC arg R
CTA leu L 12   CCA pro P 12   CAA gln Q 22   CGA arg R
CTG leu L 15   CCG pro P  2   CAG gln Q 19   CGG arg R
ATT ile I 21   ACT thr T 16   AAT asn N 50   AGT ser S
ATC ile I 10   ACC thr T 14   AAC asn N 13   AGC ser S
ATA ile I 32   ACA thr T 28   AAA lys K 32   AGA arg R
ATG met M 17   ACG thr T  5   AAG lys K 14   AGG arg R
GTT val V  8   GCT ala A 16   GAT asp D 18   GGT gly G
GTC val V  9   GCC ala A  7   GAC asp D 14   GGC gly G
GTA val V 26   GCA ala A 20   GAA glu E 36   GGA gly G
GTG val V 12   GCG ala A  5   GAG glu E 10   GGG gly G

Figure 6



SYNgp160mn -> Codon Usage
DNA sequence 2571 b.p. ATGAGGGTGAAG ... GCGCTGCTGTAA linear

MW : 97078 Dalton CAI(S.c) : 0.074 CAI(E.c) :

TTT phe F  -   TCT ser S  2   TAT tyr Y  1   TGT cys C
TTC phe F  -   TCC ser S      TAC tyr Y 21   TGC cys C
TTA leu L  -   TCA ser S  -   TAA OCH Z  1   TGA OPA Z
TTG leu L      TCG ser S      TAG AMB Z  -   TGG trp W
CTT leu L  -   CCT pro P      CAT his H  2   CGT arg R
CTC leu L 20   CCC pro P 26   CAC his H 12   CGC arg R
CTA leu L  1   CCA pro P  -   CAA gln Q  -   CGA arg R
CTG leu L 63   CCG pro P  2   CAG gln Q 41   CGG arg R
ATT ile I  2   ACT thr T  -   AAT asn N  2   AGT ser S
ATC ile I 61   ACC thr T 59   AAC asn N 61   AGC ser S
ATA ile I  -   ACA thr T  -   AAA lys K  1   AGA arg R
ATG met M 17   ACG thr T  4   AAG lys K 45   AGG arg R
GTT val V  -   GCT ala A  -   GAT asp D  2   GGT gly G
GTC val V  1   GCC ala A 40   GAC asp D 30   GGC gly G
GTA val V  1   GCA ala A  -   GAA glu E  3   GGA gly G
GTG val V 53   GCG ala A  8   GAG glu E 43   GGG gly G

SEQUENCE LISTING PART OF THE DESCRIPTION



SEQ. ID. NO. 1 - Wild type gagpol sequence for strain HXB2 (accession no. K03455) 

ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ATCGATGGGA AAAAATTCGG 60 
TTAAGGCCAG GGGGAAAGAA AAAATATAAA TTAAAACATA TAGTATGGGC AAGCAGGGAG 120 
CTAGAACGAT TCGCAGTTAA TCCTGGCCTG TTAGAAACAT CAGAAGGCTG TAGACAAATA 180 
CTGGGACAGC TACAACCATC CCTTCAGACA GGATCAGAAG AACTTAGATC ATTATATAAT 240 
ACAGTAGCAA CCCTCTATTG TGTGCATCAA AGGATAGAGA TAAAAGACAC CAAGGAAGCT 300 
TTAGACAAGA TAGAGGAAGA GCAAAACAAA AGTAAGAAAA AAGCACAGCA AGCAGCAGCT 360 
GACACAGGAC ACAGCAATCA GGTCAGCCAA AATTACCCTA TAGTGCAGAA CATCCAGGGG 420 
CAAATGGTAC ATCAGGCCAT ATCACCTAGA ACTTTAAATG CATGGGTAAA AGTAGTAGAA 480 
GAGAAGGCTT TCAGCCCAGA AGTGATACCC ATGTTTTCAG CATTATCAGA AGGAGCCACC 540 
CCACAAGATT TAAACACCAT GCTAAACACA GTGGGGGGAC ATCAAGCAGC CATGCAAATG 600 
TTAAAAGAGA CCATCAATGA GGAAGCTGCA GAATGGGATA GAGTGCATCC AGTGCATGCA 660 
GGGCCTATTG CACCAGGCCA GATGAGAGAA CCAAGGGGAA GTGACATAGC AGGAACTACT 720 
AGTACCCTTC AGGAACAAAT AGGATGGATG ACAAATAATC CACCTATCCC AGTAGGAGAA 780 
ATTTATAAAA GATGGATAAT CCTGGGATTA AATAAAATAG TAAGAATGTA TAGCCCTACC 840 
AGCATTCTGG ACATAAGACA AGGACCAAAG GAACCCTTTA GAGACTATGT AGACCGGTTC 900 
TATAAAACTC TAAGAGCCGA GCAAGCTTCA CAGGAGGTAA AAAATTGGAT GACAGAAACC 960 
TTGTTGGTCC AAAATGCGAA CCCAGATTGT AAGACTATTT TAAAAGCATT GGGACCAGCG 1020 
GCTACACTAG AAGAAATGAT GACAGCATGT CAGGGAGTAG GAGGACCCGG CCATAAGGCA 1080 
AGAGTTTTGG CTGAAGCAAT GAGCCAAGTA ACAAATTCAG CTACCATAAT GATGCAGAGA 1140 
GGCAATTTTA GGAACCAAAG AAAGATTGTT AAGTGTTTCA ATTGTGGCAA AGAAGGGCAC 1200 
ACAGCCAGAA ATTGCAGGGC CCCTAGGAAA AAGGGCTGTT GGAAATGTGG AAAGGAAGGA 1260 
CACCAAATGA AAGATTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320 
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 1380 
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 1440 
AAGGAACTGT ATCCTTTAAC TTCCCTCAGG TCACTCTTTG GCAACGACCC CTCGTCACAA 1500 
TAAAGATAGG GGGGCAACTA AAGGAAGCTC TATTAGATAC AGGAGCAGAT GATACAGTAT 1560 
TAGAAGAAAT GAGTTTGCCA GGAAGATGGA AACCAAAAAT GATAGGGGGA ATTGGAGGTT 1620 
TTATCAAAGT AAGACAGTAT GATCAGATAC TCATAGAAAT CTGTGGACAT AAAGCTATAG 1680 
GTACAGTATT AGTAGGACCT ACACCTGTCA ACATAATTGG AAGAAATCTG TTGACTCAGA 1740 
TTGGTTGCAC TTTAAATTTT CCCATTAGCC CTATTGAGAC TGTACCAGTA AAATTAAAGC 1800 
CAGGAATGGA TGGCCCAAAA GTTAAACAAT GGCCATTGAC AGAAGAAAAA ATAAAAGCAT 1860 
TAGTAGAAAT TTGTACAGAG ATGGAAAAGG AAGGGAAAAT TTCAAAAATT GGGCCTGAAA 1920 
ATCCATACAA TACTCCAGTA TTTGCCATAA AGAAAAAAGA CAGTACTAAA TGGAGAAAAT 1980 
TAGTAGATTT CAGAGAACTT AATAAGAGAA CTCAAGACTT CTGGGAAGTT CAATTAGGAA 2040 
TACCACATCC CGCAGGGTTA AAAAAGAAAA AATCAGTAAC AGTACTGGAT GTGGGTGATG 2100 
CATATTTTTC AGTTCCCTTA GATGAAGACT TCAGGAAGTA TACTGCATTT ACCATACCTA 2160 
GTATAAACAA TGAGACACCA GGGATTAGAT ATCAGTACAA TGTGCTTCCA CAGGGATGGA 2220 
AAGGATCACC AGCAATATTC CAAAGTAGCA TGACAAAAAT CTTAGAGCCT TTTAGAAAAC 2280 
AAAATCCAGA CATAGTTATC TATCAATACA TGGATGATTT GTATGTAGGA TCTGACTTAG 2340 
AAATAGGGCA GCATAGAACA AAAATAGAGG AGCTGAGACA ACATCTGTTG AGGTGGGGAC 2400 
TTACCACACC AGACAAAAAA CATCAGAAAG AACCTCCATT CCTTTGGATG GGTTATGAAC 2460 
TCCATCCTGA TAAATGGACA GTACAGCCTA TAGTGCTGCC AGAAAAAGAC AGCTGGACTG 2520 
TCAATGACAT ACAGAAGTTA GTGGGGAAAT TGAATTGGGC AAGTCAGATT TACCCAGGGA 2580 
TTAAAGTAAG GCAATTATGT AAACTCCTTA GAGGAACCAA AGCACTAACA GAAGTAATAC 2640 
CACTAACAGA AGAAGCAGAG CTAGAACTGG CAGAAAACAG AGAGATTCTA AAAGAACCAG 2700 
TACATGGAGT GTATTATGAC CCATCAAAAG ACTTAATAGC AGAAATACAG AAGCAGGGGC 2760 
AAGGCCAATG GACATATCAA ATTTATCAAG AGCCATTTAA AAATCTGAAA ACAGGAAAAT 2820 
ATGCAAGAAT GAGGGGTGCC CACACTAATG ATGTAAAACA ATTAACAGAG GCAGTGCAAA 2880 
AAATAACCAC AGAAAGCATA GTAATATGGG GAAAGACTCC TAAATTTAAA CTGCCCATAC 2940 
AAAAGGAAAC ATGGGAAACA TGGTGGACAG AGTATTGGCA AGCCACCTGG ATTCCTGAGT 3000 
GGGAGTTTGT TAATACCCCT CCCTTAGTGA AATTATGGTA CCAGTTAGAG AAAGAACCCA 3060 
TAGTAGGAGC AGAAACCTTC TATGTAGATG GGGCAGCTAA CAGGGAGACT AAATTAGGAA 3120 
AAGCAGGATA TGTTACTAAT AGAGGAAGAC AAAAAGTTGT CACCCTAACT GACACAACAA 3180 
ATCAGAAGAC TGAGTTACAA GCAATTTATC TAGCTTTGCA GGATTCGGGA TTAGAAGTAA 3240 
ACATAGTAAC AGACTCACAA TATGCATTAG GAATCATTCA AGCACAACCA GATCAAAGTG 3300 
AATCAGAGTT AGTCAATCAA ATAATAGAGC AGTTAATAAA AAAGGAAAAG GTCTATCTGG 3360 
CATGGGTACC AGCACACAAA GGAATTGGAG GAAATGAACA AGTAGATAAA TTAGTCAGTG 3420 
CTGGAATCAG GAAAGTACTA TTTTTAGATG GAATAGATAA GGCCCAAGAT GAACATGAGA 3480 
AATATCACAG TAATTGGAGA GCAATGGCTA GTGATTTTAA CCTGCCACCT GTAGTAGCAA 3540 
AAGAAATAGT AGCCAGCTGT GATAAATGTC AGCTAAAAGG AGAAGCCATG CATGGACAAG 3600 
TAGACTGTAG TCCAGGAATA TGGCAACTAG ATTGTACACA TTTAGAAGGA AAAGTTATCC 3660 
TGGTAGCAGT TCATGTAGCC AGTGGATATA TAGAAGCAGA AGTTATTCCA GCAGAAACAG 3720 
GGCAGGAAAC AGCATATTTT CTTTTAAAAT TAGCAGGAAG ATGGCCAGTA AAAACAATAC 3780 
ATACTGACAA TGGCAGCAAT TTCACCGGTG CTACGGTTAG GGCCGCCTGT TGGTGGGCGG 3840 
GAATCAAGCA GGAATTTGGA ATTCCCTACA ATCCCCAAAG TCAAGGAGTA GTAGAATCTA 3900 
TGAATAAAGA ATTAAAGAAA ATTATAGGAC AGGTAAGAGA TCAGGCTGAA CATCTTAAGA 3960 
CAGCAGTACA AATGGCAGTA TTCATCCACA ATTTTAAAAG AAAAGGGGGG ATTGGGGGGT 4020 
ACAGTGCAGG GGAAAGAATA GTAGACATAA TAGCAACAGA CATACAAACT AAAGAATTAC 4080 
AAAAACAAAT TACAAAAATT CAAAATTTTC GGGTTTATTA CAGGGACAGC AGAAATTCAC 4140 
TTTGGAAAGG ACCAGCAAAG CTCCTCTGGA AAGGTGAAGG GGCAGTAGTA ATACAAGATA 4200 
ATAGTGACAT AAAAGTAGTG CCAAGAAGAA AAGCAAAGAT CATTAGGGAT TATGGAAAAC 4260 
AGATGGCAGG TGATGATTGT GTGGCAAGTA GACAGGATGA GGATTAG 4307 



SEQ. ID. NO. 2 - gagpol-SYNgp - codon optimised gagpol sequence 

ATGGGCGCCC GCGCCAGCGT GCTGTCGGGC GGCGAGCTGG ACCGCTGGGA GAAGATCCGC 60 
CTGCGCCCCG GCGGCAAAAA GAAGTACAAG CTGAAGCACA TCGTGTGGGC CAGCCGCGAA 120 
CTGGAGCGCT TCGCCGTGAA CCCCGGGCTC CTGGAGACCA GCGAGGGGTG CCGCCAGATC 180 
CTCGGCCAAC TGCAGCCCAG CCTGCAAACC GGCAGCGAGG AGCTGCGCAG CCTGTACAAC 240 
ACCGTGGCCA CGCTGTACTG CGTCCACCAG CGCATCGAAA TCAAGGATAC GAAAGAGGCC 300 
CTGGATAAAA TCGAAGAGGA ACAGAATAAG AGCAAAAAGA AGGCCCAACA GGCCGCCGCG 360 
GACACCGGAC ACAGCAACCA GGTCAGCCAG AACTACCCCA TCGTGCAGAA CATCCAGGGG 420 
CAGATGGTGC ACCAGGCCAT CTCCCCCCGC ACGCTGAACG CCTGGGTGAA GGTGGTGGAA 480 
GAGAAGGCTT TTAGCCCGGA GGTGATACCC ATGTTCTCAG CCCTGTCAGA GGGAGCCACC 540 
CCCCAAGATC TGAACACCAT GCTCAACACA GTGGGGGGAC ACCAGGCCGC CATGCAGATG 600 
CTGAAGGAGA CCATCAATGA GGAGGCTGCC GAATGGGATC GTGTGCATCC GGTGCACGCA 660 
GGGCCCATCG CACCGGGCCA GATGCGTGAG CCACGGGGCT CAGACATCGC CGGAACGACT 720 
AGTACCCTTC AGGAACAGAT CGGCTGGATG ACCAACAACC CACCCATCCC GGTGGGAGAA 780 
ATCTACAAAC GCTGGATCAT CCTGGGCCTG AACAAGATCG TGCGCATGTA TAGCCCTACC 840 
AGCATCCTGG ACATCCGCCA AGGCCCGAAG GAACCCTTTC GCGACTACGT GGACCGGTTC 900 
TACAAAACGC TCCGCGCCGA GCAGGCTAGC CAGGAGGTGA AGAACTGGAT GACCGAAACC 960 
CTGCTGGTCC AGAACGCGAA CCCGGACTGC AAGACGATCC TGAAGGCCCT GGGCCCAGCG 1020 
GCTACCCTAG AGGAAATGAT GACCGCCTGT CAGGGAGTGG GCGGACCCGG CCACAAGGCA 1080 
CGCGTCCTGG CTGAGGCCAT GAGCCAGGTG ACCAACTCCG CTACCATCAT GATGCAGCGC 1140 
GGCAACTTTC GGAACCAACG CAAGATCGTC AAGTGCTTCA ACTGTGGCAA AGAAGGGCAC 1200 
ACAGCCCGCA ACTGCAGGGC CCCTAGGAAA AAGGGCTGCT GGAAATGCGG CAAGGAAGGC 1260 
CACCAGATGA AAGACTGTAC TGAGAGACAG GCTAATTTTT TAGGGAAGAT CTGGCCTTCC 1320 
TACAAGGGAA GGCCAGGGAA TTTTCTTCAG AGCAGACCAG AGCCAACAGC CCCACCAGAA 1380 
GAGAGCTTCA GGTCTGGGGT AGAGACAACA ACTCCCCCTC AGAAGCAGGA GCCGATAGAC 1440 
AAGGAACTGT ATCCTTTAAC TTCCCTCAGA TCACTCTTTG GCAACGACCC CTCGTCACAA 1500 
TAAAGATAGG GGGGCAGCTC AAGGAGGCTC TCCTGGACAC CGGAGCAGAC GACACCGTGC 1560 
TGGAGGAGAT GTCGTTGCCA GGCCGCTGGA AGCCGAAGAT GATCGGGGGA ATCGGCGGTT 1620 
TCATCAAGGT GCGCCAGTAT GACCAGATCC TCATCGAAAT CTGCGGCCAC AAGGCTATCG 1680 
GTACCGTGCT GGTGGGCCCC ACACCCGTCA ACATCATCGG ACGCAACCTG TTGACGCAGA 1740 
TCGGTTGCAC GCTGAACTTC CCCATTAGCC CTATCGAGAC GGTACCGGTG AAGCTGAAGC 1800 
CCGGGATGGA CGGCCCGAAG GTCAAGCAAT GGCCATTGAC AGAGGAGAAG ATCAAGGCAC 1860 
TGGTGGAGAT TTGCACAGAG ATGGAAAAGG AAGGGAAAAT CTCCAAGATT GGGCCTGAGA 1920 
ACCCGTACAA CACGCCGGTG TTCGCAATCA AGAAGAAGGA CTCGACGAAA TGGCGCAAGC 1980 
TGGTGGACTT CCGCGAGCTG AACAAGCGCA CGCAAGACTT CTGGGAGGTT CAGCTGGGCA 2040 
TCCCGCACCC CGCAGGGCTG AAGAAGAAGA AATCCGTGAC CGTACTGGAT GTGGGTGATG 2100 
CCTACTTCTC CGTTCCCCTG GACGAAGACT TCAGGAAGTA CACTGCCTTC ACAATCCCTT 2160 
CGATCAACAA CGAGACACCG GGGATTCGAT ATCAGTACAA CGTGCTGCCC CAGGGCTGGA 2220 
AAGGCTCTCC CGCAATCTTC CAGAGTAGCA TGACCAAAAT CCTGGAGCCT TTCCGCAAAC 2280 
AGAACCCCGA CATCGTCATC TATCAGTACA TGGATGACTT GTACGTGGGC TCTGATCTAG 2340 
AGATAGGGCA GCACCGCACC AAGATCGAGG AGCTGCGCCA GCACCTGTTG AGGTGGGGAC 2400 
TGACCACACC CGACAAGAAG CACCAGAAGG AGCCTCCCTT CCTCTGGATG GGTTACGAGC 2460 
TGCACCCTGA CAAATGGACC GTGCAGCCTA TCGTGCTGCC AGAGAAAGAC AGCTGGACTG 2520 
TCAACGACAT ACAGAAGCTG GTGGGGAAGT TGAACTGGGC CAGTCAGATT TACCCAGGGA 2580 
TTAAGGTGAG GCAGCTGTGC AAACTCCTCC GCGGAACCAA GGCACTCACA GAGGTGATCC 2640 
CCCTAACCGA GGAGGCCGAG CTCGAACTGG CAGAAAACCG AGAGATCCTA AAGGAGCCCG 2700 
TGCACGGCGT GTACTATGAC CCCTCCAAGG ACCTGATCGC CGAGATCCAG AAGCAGGGGC 2760 
AAGGCCAGTG GACCTATCAG ATTTACCAGG AGCCCTTCAA GAACCTGAAG ACCGGCAAGT 2820 
ACGCCCGGAT GAGGGGTGCC CACACTAACG ACGTCAAGCA GCTGACCGAG GCCGTGCAGA 2880 
AGATCACCAC CGAAAGCATC GTGATCTGGG GAAAGACTCC TAAGTTCAAG CTGCCCATCC 2940 
AGAAGGAAAC CTGGGAAACC TGGTGGACAG AGTATTGGCA GGCCACCTGG ATTCCTGAGT 3000 
GGGAGTTCGT CAACACCCCT CCCCTGGTGA AGCTGTGGTA CCAGCTGGAG AAGGAGCCCA 3060 
TAGTGGGCGC CGAAACCTTC TACGTGGATG GGGCCGCTAA CAGGGAGACT AAGCTGGGCA 3120 
AAGCCGGATA CGTCACTAAC CGGGGCAGAC AGAAGGTTGT CACCCTCACT GACACCACCA 3180 
ACCAGAAGAC TGAGCTGCAG GCCATTTACC TCGCTTTGCA GGACTCGGGC CTGGAGGTGA 3240 
ACATCGTGAC AGACTCTCAG TATGCCCTGG GCATCATTCA AGCCCAGCCA GACCAGAGTG 3300 
AGTCCGAGCT GGTCAATCAG ATCATCGAGC AGCTGATCAA GAAGGAAAAG GTCTATCTGG 3360 
CCTGGGTACC CGCCCACAAA GGCATTGGCG GCAATGAGCA GGTCGACAAG CTGGTCTCGG 3420 
CTGGCATCAG GAAGGTGCTA TTCCTGGATG GCATCGACAA GGCCCAGGAC GAGCACGAGA 3480 
AATACCACAG CAACTGGCGG GCCATGGCTA GCGACTTCAA CCTGCCCCCT GTGGTGGCCA 3540 
AAGAGATCGT GGCCAGCTGT GACAAGTGTC AGCTCAAGGG CGAAGCCATG CATGGCCAGG 3600 
TGGACTGTAG CCCCGGCATC TGGCAACTCG ATTGCACCCA TCTGGAGGGC AAGGTTATCC 3660 
TGGTAGCCGT CCATGTGGCC AGTGGCTACA TCGAGGCCGA GGTCATTCCC GCCGAAACAG 3720 
GGCAGGAGAC AGCCTACTTC CTCCTGAAGC TGGCAGGCCG GTGGCCAGTG AAGACCATCC 3780 
ATACTGACAA TGGCAGCAAT TTCACCAGTG CTACGGTTAA GGCCGCCTGC TGGTGGGCGG 3840 
GAATCAAGCA GGAGTTCGGG ATCCCCTACA ATCCCCAGAG TCAGGGCGTC GTCGAGTCTA 3900 
TGAATAAGGA GTTAAAGAAG ATTATCGGCC AGGTCAGAGA TCAGGCTGAG CATCTCAAGA 3960 
CCGCGGTCCA AATGGCGGTA TTCATCCACA ATTTCAAGCG GAAGGGGGGG ATTGGGGGGT 4020 
ACAGTGCGGG GGAGCGGATC GTGGACATCA TCGCGACCGA CATCCAGACT AAGGAGCTGC 4080 
AAAAGCAGAT TACCAAGATT CAGAATTTCC GGGTCTACTA CAGGGACAGC AGAAATCCCC 4140 
TCTGGAAAGG CCCAGCGAAG CTCCTCTGGA AGGGTGAGGG GGCAGTAGTG ATCCAGGATA 4200 
ATAGCGACAT CAAGGTGGTG CCCAGAAGAA AGGCGAAGAT CATTAGGGAT TATGGCAAAC 4260 
AGATGGCGGG TGATGATTGC GTGGCGAGCA GACAGGATGA GGATTAG 4307 
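
A codon optimised sequence differs from the wild type sequence at the nucleotide level while being expected to encode the same polypeptide. The following is a minimal sketch of how that can be checked by translating both sequences with the standard genetic code. It is illustrative only: the names CODON_TABLE and translate are hypothetical, and the inputs are just the first 60 bases of the wild type and codon optimised gagpol sequences as printed in the first line of each listing above, not the full 4307 base sequences.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def clean(dna: str) -> str:
    """Keep only A, C, G, T (drops spaces and printed position numbers)."""
    return "".join(ch for ch in dna.upper() if ch in "ACGT")


def translate(dna: str) -> str:
    """Translate complete in-frame codons, starting at the first base."""
    seq = clean(dna)
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First printed line of each gagpol listing (illustrative fragments only).
wild_type = "ATGGGTGCGA GAGCGTCAGT ATTAAGCGGG GGAGAATTAG ATCGATGGGA AAAAATTCGG"
optimised = "ATGGGCGCCC GCGCCAGCGT GCTGTCGGGC GGCGAGCTGG ACCGCTGGGA GAAGATCCGC"

same_protein = translate(wild_type) == translate(optimised)
changed_bases = sum(a != b for a, b in zip(clean(wild_type), clean(optimised)))

print("same protein:", same_protein)
print("bases changed:", changed_bases)
```

The check starts at the first base of the gag reading frame. Since pol is expressed by a ribosomal frameshift, a check over the whole sequence would need to handle the two reading frames separately.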



SEQ. ID. NO. 3 - Envelope Gene from HIV-1 MN (Genbank accession no. M17449) 

ATGAGAGTGA AGGGGATCAG GAGGAATTAT CAGCACTGGT GGGGATGGGG CACGATGCTC 60 
CTTGGGTTAT TAATGATCTG TAGTGCTACA GAAAAATTGT GGGTCACAGT CTATTATGGG 120 
GTACCTGTGT GGAAAGAAGC AACCACCACT CTATTTTGTG CATCAGATGC TAAAGCATAT 180 
GATACAGAGG TACATAATGT TTGGGCCACA CAAGCCTGTG TACCCACAGA CCCCAACCCA 240 
CAAGAAGTAG AATTGGTAAA TGTGACAGAA AATTTTAACA TGTGGAAAAA TAACATGGTA 300 
GAACAGATGC ATGAGGATAT AATCAGTTTA TGGGATCAAA GCCTAAAGCC ATGTGTAAAA 360 
TTAACCCCAC TCTGTGTTAC TTTAAATTGC ACTGATTTGA GGAATACTAC TAATACCAAT 420 
AATAGTACTG CTAATAACAA TAGTAATAGC GAGGGAACAA TAAAGGGAGG AGAAATGAAA 480 
AACTGCTCTT TCAATATCAC CACAAGCATA AGAGATAAGA TGCAGAAAGA ATATGCACTT 540 
CTTTATAAAC TTGATATAGT ATCAATAGAT AATGATAGTA CCAGCTATAG GTTGATAAGT 600 
TGTAATACCT CAGTCATTAC ACAAGCTTGT CCAAAGATAT CCTTTGAGCC AATTCCCATA 660 
CACTATTGTG CCCCGGCTGG TTTTGCGATT CTAAAATGTA ACGATAAAAA GTTCAGTGGA 720 
AAAGGATCAT GTAAAAATGT CAGCACAGTA CAATGTACAC ATGGAATTAG GCCAGTAGTA 780 
TCAACTCAAC TGCTGTTAAA TGGCAGTCTA GCAGAAGAAG AGGTAGTAAT TAGATCTGAG 840 
AATTTCACTG ATAATGCTAA AACCATCATA GTACATCTGA ATGAATCTGT ACAAATTAAT 900 
TGTACAAGAC CCAACTACAA TAAAAGAAAA AGGATACATA TAGGACCAGG GAGAGCATTT 960 
TATACAACAA AAAATATAAT AGGAACTATA AGACAAGCAC ATTGTAACAT TAGTAGAGCA 1020 
AAATGGAATG ACACTTTAAG ACAGATAGTT AGCAAATTAA AAGAACAATT TAAGAATAAA 1080 
ACAATAGTCT TTAATCAATC CTCAGGAGGG GACCCAGAAA TTGTAATGCA CAGTTTTAAT 1140 
TGTGGAGGGG AATTTTTCTA CTGTAATACA TCACCACTGT TTAATAGTAC TTGGAATGGT 1200 
AATAATACTT GGAATAATAC TACAGGGTCA AATAACAATA TCACACTTCA ATGCAAAATA 1260 
AAACAAATTA TAAACATGTG GCAGGAAGTA GGAAAAGCAA TGTATGCCCC TCCCATTGAA 1320 
GGACAAATTA GATGTTCATC AAATATTACA GGGCTACTAT TAACAAGAGA TGGTGGTAAG 1380 
GACACGGACA CGAACGACAC CGAGATCTTC AGACCTGGAG GAGGAGATAT GAGGGACAAT 1440 
TGGAGAAGTG AATTATATAA ATATAAAGTA GTAACAATTG AACCATTAGG AGTAGCACCC 1500 
ACCAAGGCAA AGAGAAGAGT GGTGCAGAGA GAAAAAAGAG CAGCGATAGG AGCTCTGTTC 1560 
CTTGGGTTCT TAGGAGCAGC AGGAAGCACT ATGGGCGCAG CGTCAGTGAC GCTGACGGTA 1620 
CAGGCCAGAC TATTATTGTC TGGTATAGTG CAACAGCAGA ACAATTTGCT GAGGGCCATT 1680 
GAGGCGCAAC AGCATATGTT GCAACTCACA GTCTGGGGCA TCAAGCAGCT CCAGGCAAGA 1740 
GTCCTGGCTG TGGAAAGATA CCTAAAGGAT CAACAGCTCC TGGGGTTTTG GGGTTGCTCT 1800 
GGAAAACTCA TTTGCACCAC TACTGTGCCT TGGAATGCTA GTTGGAGTAA TAAATCTCTG 1860 
GATGATATTT GGAATAACAT GACCTGGATG CAGTGGGAAA GAGAAATTGA CAATTACACA 1920 
AGCTTAATAT ACTCATTACT AGAAAAATCG CAAACCCAAC AAGAAAAGAA TGAACAAGAA 1980 
TTATTGGAAT TGGATAAATG GGCAAGTTTG TGGAATTGGT TTGACATAAC AAATTGGCTG 2040 
TGGTATATAA AAATATTCAT AATGATAGTA GGAGGCTTGG TAGGTTTAAG AATAGTTTTT 2100 
GCTGTACTTT CTATAGTGAA TAGAGTTAGG CAGGGATACT CACCATTGTC GTTGCAGACC 2160 
CGCCCCCCAG TTCCGAGGGG ACCCGACAGG CCCGAAGGAA TCGAAGAAGA AGGTGGAGAG 2220 
AGAGACAGAG ACACATCCGG TCGATTAGTG CATGGATTCT TAGCAATTAT CTGGGTCGAC 2280 
CTGCGGAGCC TGTTCCTCTT CAGCTACCAC CACAGAGACT TACTCTTGAT TGCAGCGAGG 2340 
ATTGTGGAAC TTCTGGGACG CAGGGGGTGG GAAGTCCTCA AATATTGGTG GAATCTCCTA 2400 
CAGTATTGGA GTCAGGAACT AAAGAGTAGT GCTGTTAGCT TGCTTAATGC CACAGCTATA 2460 
GCAGTAGCTG AGGGGACAGA TAGGGTTATA GAAGTACTGC AAAGAGCTGG TAGAGCTATT 2520 
CTCCACATAC CTACAAGAAT AAGACAGGGC TTGGAAAGGG CTTTGCTATA A 2571 



SEQ. ID. NO. 4 - SYNgp-160mn - codon optimised env sequence 

ATGAGGGTGA AGGGGATCCG CCGCAACTAC CAGCACTGGT GGGGCTGGGG CACGATGCTC 60 
CTGGGGCTGC TGATGATCTG CAGCGCCACC GAGAAGCTGT GGGTGACCGT GTACTACGGC 120 
GTGCCCGTGT GGAAGGAGGC CACCACCACC CTGTTCTGCG CCAGCGACGC CAAGGCGTAC 180 
GACACCGAGG TGCACAACGT GTGGGCCACC CAGGCGTGCG TGCCCACCGA CCCCAACCCC 240 
CAGGAGGTGG AGCTCGTGAA CGTGACCGAG AACTTCAACA TGTGGAAGAA CAACATGGTG 300 
GAGCAGATGC ATGAGGACAT CATCAGCCTG TGGGACCAGA GCCTGAAGCC CTGCGTGAAG 360 
CTGACCCCCC TGTGCGTGAC CCTGAACTGC ACCGACCTGA GGAACACCAC CAACACCAAC 420 
AACAGCACCG CCAACAACAA CAGCAACAGC GAGGGCACCA TCAAGGGCGG CGAGATGAAG 480 
AACTGCAGCT TCAACATCAC CACCAGCATC CGCGACAAGA TGCAGAAGGA GTACGCCCTG 540 
CTGTACAAGC TGGATATCGT GAGCATCGAC AACGACAGCA CCAGCTACCG CCTGATCTCC 600 
TGCAACACCA GCGTGATCAC CCAGGCCTGC CCCAAGATCA GCTTCGAGCC CATCCCCATC 660 
CACTACTGCG CCCCCGCCGG CTTCGCCATC CTGAAGTGCA ACGACAAGAA GTTCAGCGGC 720 
AAGGGCAGCT GCAAGAACGT GAGCACCGTG CAGTGCACCC ACGGCATCCG GCCGGTGGTG 780 
AGCACCCAGC TCCTGCTGAA CGGCAGCCTG GCCGAGGAGG AGGTGGTGAT CCGCAGCGAG 840 
AACTTCACCG ACAACGCCAA GACCATCATC GTGCACCTGA ATGAGAGCGT GCAGATCAAC 900 
TGCACGCGTC CCAACTACAA CAAGCGCAAG CGCATCCACA TCGGCCCCGG GCGCGCCTTC 960 
TACACCACCA AGAACATCAT CGGCACCATC CGCCAGGCCC ACTGCAACAT CTCTAGAGCC 1020 
AAGTGGAACG ACACCCTGCG CCAGATCGTG AGCAAGCTGA AGGAGCAGTT CAAGAACAAG 1080 
ACCATCGTGT TCAACCAGAG CAGCGGCGGC GACCCCGAGA TCGTGATGCA CAGCTTCAAC 1140 
TGCGGCGGCG AATTCTTCTA CTGCAACACC AGCCCCCTGT TCAACAGCAC CTGGAACGGC 1200 
AACAACACCT GGAACAACAC CACCGGCAGC AACAACAATA TTACCCTCCA GTGCAAGATC 1260 
AAGCAGATCA TCAACATGTG GCAGGAGGTG GGCAAGGCCA TGTACGCCCC CCCCATCGAG 1320 
GGCCAGATCC GGTGCAGCAG CAACATCACC GGTCTGCTGC TGACCCGCGA CGGCGGCAAG 1380 
GACACCGACA CCAACGACAC CGAAATCTTC CGCCCCGGCG GCGGCGACAT GCGCGACAAC 1440 
TGGAGATCTG AGCTGTACAA GTACAAGGTG GTGACGATCG AGCCCCTGGG CGTGGCCCCC 1500 
ACCAAGGCCA AGCGCCGCGT GGTGCAGCGC GAGAAGCGGG CCGCCATCGG CGCCCTGTTC 1560 
CTGGGCTTCC TGGGGGCGGC GGGCAGCACC ATGGGGGCCG CCAGCGTGAC CCTGACCGTG 1620 
CAGGCCCGCC TGCTCCTGAG CGGCATCGTG CAGCAGCAGA ACAACCTCCT CCGCGCCATC 1680 
GAGGCCCAGC AGCATATGCT CCAGCTCACC GTGTGGGGCA TCAAGCAGCT CCAGGCCCGC 1740 
GTGCTGGCCG TGGAGCGCTA CCTGAAGGAC CAGCAGCTCC TGGGCTTCTG GGGCTGCTCC 1800 
GGCAAGCTGA TCTGCACCAC CACGGTACCC TGGAACGCCT CCTGGAGCAA CAAGAGCCTG 1860 
GACGACATCT GGAACAACAT GACCTGGATG CAGTGGGAGC GCGAGATCGA TAACTACACC 1920 
AGCCTGATCT ACAGCCTGCT GGAGAAGAGC CAGACCCAGC AGGAGAAGAA CGAGCAGGAG 1980 
CTGCTGGAGC TGGACAAGTG GGCGAGCCTG TGGAACTGGT TCGACATCAC CAACTGGCTG 2040 
TGGTACATCA AAATCTTCAT CATGATTGTG GGCGGCCTGG TGGGCCTCCG CATCGTGTTC 2100 
GCCGTGCTGA GCATCGTGAA CCGCGTGCGC CAGGGCTACA GCCCCCTGAG CCTCCAGACC 2160 
CGGCCCCCCG TGCCGCGCGG GCCCGACCGC CCCGAGGGCA TCGAGGAGGA GGGCGGCGAG 2220 
CGCGACCGCG ACACCAGCGG CAGGCTCGTG CACGGCTTCC TGGCGATCAT CTGGGTCGAC 2280 
CTCCGCAGCC TGTTCCTGTT CAGCTACCAC CACCGCGACC TGCTGCTGAT CGCCGCCCGC 2340 
ATCGTGGAAC TCCTAGGCCG CCGCGGCTGG GAGGTGCTGA AGTACTGGTG GAACCTCCTC 2400 
CAGTATTGGA GCCAGGAGCT GAAGTCCAGC GCCGTGAGCC TGCTGAACGC CACCGCCATC 2460 
GCCGTGGCCG AGGGCACCGA CCGCGTGATC GAGGTGCTCC AGAGGGCCGG GAGGGCGATC 2520 
CTGCACATCC CCACCCGCAT CCGCCAGGGG CTCGAGAGGG CGCTGCTGTA A 2571 